A scalable system for microcalcification cluster automated detection in a distributed mammographic database
Abstract 

A computer-aided detection (CADe) system for microcalcification cluster identification in mammograms has been 
developed in the framework of the EU-funded MammoGrid project. The CADe software is mainly based on wavelet 
transforms and artificial neural networks. It is able to identify microcalcifications in different datasets of mammograms 
(i.e. acquired with different machines and settings, digitized with different pitch and bit depth, or directly digital). 
The CADe can be run remotely from GRID-connected acquisition and annotation stations, supporting clinicians 
at geographically distant locations in the interpretation of mammographic data. We report and discuss the system 
performance on different datasets of mammograms and the status of the GRID-enabled CADe analysis. 

Keywords: Computer-aided detection, mammography, wavelets, neural networks, GRID applications. 



Introduction 



The EU-funded MammoGrid project [1] is currently
collecting a Europe-wide distributed database of mammograms
with the aim of applying the emerging GRID technologies [2]
to support the early detection of breast cancer. A
GRID-based infrastructure would allow resource sharing
and collaboration between radiologists throughout the
European Union. In this framework, epidemiological studies,
tele-education of young health-care professionals,
advanced image analysis and tele-diagnostic support (with
and without computer-aided detection) would be enabled.

In the image processing field, we have developed and im- 
plemented in a GRID-compliant acquisition and annotation 
station a computer-aided detection (CADe) system able to 
identify microcalcifications in different datasets of mammo- 
grams (i.e. acquired with different machines and settings, 
digitized with different pitch and bit depth or direct digital 
ones). 

This paper is structured as follows: the detection scheme
is illustrated in sec. 1; sec. 2 describes the database the
MammoGrid Collaboration has collected, whereas the tests
carried out on different datasets of mammograms and the
preliminary results obtained on a set of MammoGrid images
are discussed in sec. 3.



1 Description of the CADe system 

The CADe procedure we developed is mainly based on wavelet
transforms and artificial neural networks. Our CADe system
indicates one or more suspicious areas of a mammogram
where microcalcification clusters are possibly located,
according to the following scheme [3]:

• INPUT: digital or digitized mammogram;

• Pre-processing: a) identification of the breast skin line
and segmentation of the breast region with respect to
the background; b) application of the wavelet-based
filter in order to enhance the microcalcifications;

• Feature extraction: a) decomposition of the breast region
into several N×N pixel-wide partially-overlapping
sub-images to be processed one at a time; b) automatic
extraction of the features characterizing each
sub-image;

• Classification: assigning each processed sub-image
either to the class of microcalcification clusters or to that
of normal tissue;

• OUTPUT: merging the contiguous or partially overlapping
sub-images and visualization of the final output
by drawing the contours of the suspicious areas on the
original image.

1.1 Pre-processing of the mammograms 

The pre-processing procedure aims to enhance the signals 
revealing the presence of microcalcifications, while sup- 
pressing the complex and noisy non-pathological breast 
tissue. A mammogram is usually dominated by the low- 
frequency information, whereas the microcalcifications ap- 
pear as high-frequency contributions. Microcalcifications 
show some evident features at some specific scales, while 
they are almost negligible at other scales. The use of the 
wavelet transform [4-6] allows for a separation of the more 
important high-resolution components of the mammogram 
from the less important low-resolution ones. 

Once the breast skin line is identified, the breast region is 
processed by the wavelet-based filter, according to the fol- 
lowing main steps: identification of the family of wavelets 
and the level up to which the decomposition has to be per- 
formed in order to highlight the interesting details; ma- 
nipulation of the wavelet coefficients (i.e. suppression of 
the coefficients encoding the low-frequency contributions 
and enhancement of those encoding the contributions of 
interesting details); inverse wavelet transform. By properly 
thresholding the wavelet coefficients at each level of the de- 
composition, an enhancement of the microcalcification with 
respect to surrounding normal tissue can be achieved in 
the synthesized image. To achieve this result, the
wavelet basis, the level up to which the decomposition has
to be performed and the thresholding rules to be applied to
the wavelet coefficients have to be accurately set. All these
choices and parameters are application dependent. The size
of the pixel pitch and the dynamical range of the gray level 
intensities characterizing the mammograms are the most 
important parameters to be taken into account. 

1.2 Feature extraction 

In order to extract from a mammogram the features to be
submitted to the classifier, small regions of the mammogram
are analyzed one at a time. Fragmenting the mammogram
into small sub-images is intended both to reduce
the amount of data to be analyzed at the same time and to
facilitate the localization of the lesions possibly present on
a mammogram. The size of the sub-images has been chosen
according to the basic rule of considering the smallest
square area matching the typical size of a small
microcalcification cluster. Since a single microcalcification
is rarely larger than 1 mm, and the mean distance between
two microcalcifications belonging to the same cluster is
generally smaller than 5 mm, we assume a square with a
5 mm side to be large enough to accommodate a small
cluster. This sub-image size is appropriate to discriminate an
isolated microcalcification (which is not considered to be a
pathological sign) from a group of microcalcifications close
together. The length of the square side in pixel units is
determined by the pixel pitch of the digitizer or of
the direct digital device. Let us assume that our choice for
the length of the square side corresponds to N pixels. In
order to avoid accidentally missing a microcalcification
cluster that happens to lie at the interface between two
contiguous sub-images, we use partially overlapping
sub-images, i.e. we let the mask selecting
the sub-image to be analyzed move through the mammogram
by half of the side length (N/2 pixels) at each horizontal
and vertical step. In this way each region of a mammogram
is analyzed more than once with respect to different
neighboring regions.
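
The tiling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `side_in_pixels` and `window_origins` are hypothetical, and the handling of the image border (windows that would not fit entirely are simply skipped) is a simplification not taken from the paper.

```python
def side_in_pixels(side_mm, pitch_um):
    """Number of pixels spanning side_mm at the given pixel pitch (rounded)."""
    return round(side_mm * 1000.0 / pitch_um)


def window_origins(height, width, n):
    """Yield the (row, col) top-left corner of each n x n sub-image.

    The mask moves by n // 2 pixels at each horizontal and vertical step,
    so neighbouring sub-images overlap by half of their side.
    Assumption: windows that would cross the image border are skipped.
    """
    step = n // 2
    for row in range(0, height - n + 1, step):
        for col in range(0, width - n + 1, step):
            yield row, col


# Example: a 5.1 mm side at an 85 um pitch, on a small 120 x 120 pixel region.
n = side_in_pixels(5.1, 85)
origins = list(window_origins(120, 120, n))
```

With this stride, a pixel in the interior of the breast region falls inside up to four different sub-images, which is what allows a cluster sitting on a tile boundary to appear whole in at least one of them.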

Each N×N pixel-wide sub-image extracted from the filtered
mammogram is processed by an auto-associative neural
network, used to perform an automatic extraction of
the relevant features of the sub-image. The implementation
of an auto-associative neural network is a neural-based
method to perform an unsupervised feature extraction [7-10].
This step has been introduced in the CADe
scheme to reduce the dimensionality of the data
(the gray level intensity values of the N×N pixels of each
sub-image) to be classified by the system. The architecture
of the network we use is a bottle-neck one, consisting of
three layers of $N^2$ input, n hidden (where $n \ll N^2$) and $N^2$
output neurons respectively. This neural network is trained
to reproduce in output the input values. The overall
activation of the n nodes of the bottle-neck layer summarizes
the relevant features of the examined sub-image. The closer
the N×N pixel-wide sub-image obtained as output is
to the original sub-image provided as input, the more the
activation potentials of the n hidden neurons are supposed
to accommodate the information contained in the original
sub-image.

It is worth noticing that the implementation of an
auto-associative neural network at this stage of the CADe scheme
allows for a strong compression of the parameters representing
each sub-image ($N^2 \to n$) to be passed to the following
step of the analysis.
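
As a rough illustration of the bottle-neck architecture, the following Python sketch builds an untrained network and runs a forward pass. Everything here is an assumption made for the example: the helper names, the random weight initialization, the tiny layer sizes, and the use of a sigmoid on both layers. It is not the authors' implementation and no training is performed.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Fully connected layer followed by a sigmoid activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def make_autoencoder(n_pixels, n_hidden, seed=0):
    """Random (untrained) weights for an n_pixels -> n_hidden -> n_pixels net."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "w_enc": matrix(n_hidden, n_pixels), "b_enc": [0.0] * n_hidden,
        "w_dec": matrix(n_pixels, n_hidden), "b_dec": [0.0] * n_pixels,
    }


def encode(net, pixels):
    """Activations of the n hidden units: the extracted features."""
    return layer(pixels, net["w_enc"], net["b_enc"])


def reconstruct(net, pixels):
    """Output layer: the network's attempt to reproduce its input."""
    return layer(encode(net, pixels), net["w_dec"], net["b_dec"])


def mse(a, b):
    """Mean squared error, the quantity minimized during training."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy sizes: a 6 x 6 sub-image (36 gray levels in [0, 1]) squeezed to 4 features.
net = make_autoencoder(n_pixels=36, n_hidden=4)
sub_image = [random.Random(1).random() for _ in range(36)]
features = encode(net, sub_image)
error = mse(reconstruct(net, sub_image), sub_image)
```

Training would adjust the weights to reduce `error` over many sub-images; only the `features` vector is then passed on to the classifier.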

1.3 Classification 

We use the n features extracted by the auto-associative neu- 
ral network to assign each sub-image to either the class 
of sub-images containing microcalcification clusters or the 
class of those consisting only of normal breast tissue. A 
standard three-layer feed-forward neural network has been
chosen to perform the classification of the n features
extracted from each sub-image. The general architecture
of this net consists of n input, h hidden and
two output neurons, and the supervised training phase is
based on the back-propagation algorithm.

The performance of the training algorithm was evaluated
according to the 5x2 cross validation method [11].
It is the recommended test for algorithms
that can be executed 10 times, because it provides a reliable
estimate of the variation of the algorithm performance
due to the choice of the training set. This method consists
in performing 5 replications of the 2-fold cross validation
method [12]. At each replication, the available data are
randomly partitioned into 2 sets ($A_i$ and $B_i$ for $i = 1, \ldots, 5$)
with an almost equal number of entries. The learning
algorithm is trained on each set and tested on the other one.
The system performance is given in terms of the sensitivity
and specificity values, where the sensitivity is defined
as the true positive fraction (fraction of sub-images
containing microcalcification clusters correctly classified by
the system), whereas the specificity is defined
as the true negative fraction (fraction of normal-tissue
sub-images correctly classified by the system). In order to show the
trade-off between the sensitivity and the specificity, a
Receiver Operating Characteristic (ROC) analysis has been
performed [13, 14]. The ROC curve is obtained by plotting
the true positive fraction versus the false positive fraction
of the cases (1 - specificity), computed while the decision
threshold of the classifier is varied. Each decision threshold
results in a corresponding operating point on the curve.
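
The construction of the ROC curve can be made concrete with a minimal Python sketch. The function name `roc_points`, the convention that a sub-image is called positive when its score is greater than or equal to the threshold, and the toy scores are all assumptions for illustration.

```python
def roc_points(scores, labels):
    """Return (false positive fraction, true positive fraction) pairs.

    scores: classifier outputs, higher meaning more suspicious.
    labels: 1 for sub-images with a cluster, 0 for normal tissue.
    One operating point is produced per decision threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    # Start above every score (nothing is flagged), then loosen the threshold.
    for threshold in [float("inf")] + sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Toy example with five sub-images.
points = roc_points([0.9, 0.8, 0.6, 0.4, 0.2], [1, 1, 0, 1, 0])
```

The second coordinate of each point is the sensitivity and one minus the first coordinate is the specificity at that threshold.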

2 The MammoGrid distributed database

One of the main goals of the EU-funded MammoGrid
project is the realization of a GRID-enabled European
database of mammograms, with the aim of supporting the
collaboration among clinicians from different locations in 
the analysis of mammographic data. Mammograms in the 
DICOM [15] format are collected through the MammoGrid 
acquisition and annotation workstations installed in the 
participating hospitals. Standardized images are stored into 
the GRID-connected database. The image standardization 
is realized by the Standard-Mammogram-Form (SMF) algo- 
rithm [16] developed by the Mirada Solutions Company™, 
a partner of the MammoGrid project. The SMF provides 
a normalized representation of the mammogram, i.e. inde- 
pendent of the data source and of the acquisition technical 
parameters (e.g. mAs, kVp and breast thickness). 

The dataset of fully-annotated mammograms containing
microcalcification clusters available at present to CADe
developers consists of 123 mammograms belonging to
57 patients: 46 of them have been collected and digitized
at the University Hospital of Udine (IT), whereas the
remaining 11 were acquired by the full-field digital
mammography system GE Senographe 2000D at the Torino Hospital
(IT); all have been stored in the MammoGrid database by
means of the MammoGrid workstation prototype installed
in Udine.

3 Tests and results 

As the number of mammograms collected at present in the
MammoGrid database is too small for properly training
the neural networks implemented in the characterization
and classification procedures of our CADe, we used a larger
dataset of mammograms for developing the system. Once
the CADe had been trained and tested, we adapted it to
the MammoGrid images and we evaluated its performance
on the MammoGrid database. The dataset used for training
and testing the CADe was extracted from the
fully-annotated MAGIC-5 database [17]. We used 375 mammograms
containing microcalcification clusters and 610 normal
mammograms digitized with a pixel pitch of 85 μm and an
effective dynamical range of 12 bits per pixel.

Figure 1: Examples of the wavelet-based filter performance
on tissues with different densities (top/bottom:
original/filtered sub-images containing microcalcification
clusters).

3.1 Training and testing the CADe on the MAGIC-5 database

To perform the multi-resolution analysis we considered the
Daubechies family of wavelets [4], in particular the db5
mother wavelet. The decomposition is performed up to
the fourth level. We found that resolution level
1 mainly shows the high-frequency noise included in the
mammogram, whereas levels 2, 3 and 4 contain the
high-frequency components related to the presence of
microcalcifications. Levels greater than 4 exhibit a strong
correlation with larger structures possibly present in the
normal breast tissue. In order to enhance microcalcifications,
the approximation coefficients at level 4 and the detail
coefficients at the first level were neglected. By contrast, the
statistical analysis of the distributions of the remaining
detail coefficients led us to retain for the synthesis
procedure only those coefficients whose values exceed 2σ,
where σ is the standard deviation of the coefficient
distribution at that level. Some examples of the performance of
the filter on mammographic images containing
microcalcification clusters embedded in tissues with different densities
are shown in fig. 1.
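
The coefficient manipulation just described can be illustrated with a deliberately simplified Python sketch. Several assumptions are made so that the example stays self-contained: it works on a 1-D gray-level profile rather than a 2-D image, it swaps the db5 wavelet for the much simpler Haar wavelet, it thresholds on the absolute value of the coefficients, and it uses the population standard deviation. The function names are hypothetical.

```python
import math
import statistics


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert haar_step, doubling the length of the signal."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


def enhance(signal, levels=4, k=2.0):
    """Filter a 1-D profile whose length is a multiple of 2**levels."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)            # details[0] holds level 1 (finest)
    approx = [0.0] * len(approx)          # neglect the coarsest approximation
    details[0] = [0.0] * len(details[0])  # neglect level-1 details (noise)
    for j in range(1, levels):            # levels 2 .. `levels`
        sigma = statistics.pstdev(details[j])
        details[j] = [d if abs(d) > k * sigma else 0.0 for d in details[j]]
    for detail in reversed(details):      # synthesis, coarsest level first
        approx = haar_inverse_step(approx, detail)
    return approx


# Toy profile: flat background with a small bright structure.
profile = [10.0] * 64
profile[32:36] = [50.0] * 4
filtered = enhance(profile)
```

In this toy case the flat background is suppressed while the bright structure survives the thresholding, which is the qualitative behaviour the filter is meant to have. A real implementation would use a 2-D transform with the db5 wavelet.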

The training and testing of the auto-associative neural
network was performed on a dataset of 149 mammograms
containing microcalcification clusters and 299 normal
mammograms. The size N of the sub-images to be analyzed
by this neural network has been chosen as N = 60, thus
corresponding to a physical region of 5.1×5.1 mm². The
number n of units in the hidden layer has been fixed
according to the requirement of having the minimum number
of neurons allowing for a good generalization capability of
the system. Assigning too many neurons to the hidden layer
would facilitate the convergence of the learning phase, but
it could reduce the generalization capability of the network.
Moreover, an overpopulated hidden layer could set too
stringent limits on the minimum number of patterns needed for
training the neural classifier implemented in the following
step of the analysis. By contrast, too small a hidden layer
would lead to the saturation of some of the hidden units and
thus negatively affect the overall performance of the system.
A good compromise between these two opposite trends has
been reached by assigning 80 units to the hidden layer. The
network architecture is thus fixed to be: 3600 input, 80
hidden and 3600 output neurons. The algorithm used in the
training procedure is the standard back-propagation with
momentum and the activation function is a sigmoid. We
used a learning rate of 0.4 and a momentum of 0.2. The
behavior of the mean squared error computed during the
learning procedure at each epoch on the train set and every
ten epochs on the test set is shown in fig. 2. The training
phase has been stopped once the error on the test set has
reached the minimum value (early stop). As shown in fig. 2,
this happens between epochs 80 and 90. The training phase
was thus stopped at 85 epochs.

Figure 2: Mean squared errors on the train and test sets in
the learning phase of the auto-associative neural network:
the minimum error on the test set is reached between 80
and 90 epochs.

The dataset used for the supervised training of the
feed-forward neural classifier consists of 156 mammograms
with microcalcification clusters and 241 normal
mammograms. The standard back-propagation algorithm was
implemented and the best performance was achieved with 10
neurons in the hidden layer. The performance our learning
algorithm achieved according to the 5x2 cross-validation
method is reported in tab. 1 in terms of the sensitivity
and specificity values. As can be noticed, the performance
the neural classifier achieves is robust, i.e. almost
independent of the partitioning of the available data into the
train and test sets. The average performances achieved in
the testing phase are 93.4% for the sensitivity and 91.8%
for the specificity.
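
The 5x2 cross-validation protocol of sec. 1.3 can be outlined as follows. This is a sketch under assumptions: `train_and_test` stands for a caller-supplied, hypothetical routine that trains the classifier on one set and returns the (sensitivity, specificity) pair measured on the other, and the random seed is arbitrary.

```python
import random


def five_by_two_cv(samples, train_and_test, seed=0):
    """Run 5 replications of 2-fold cross validation.

    train_and_test(train, test) is a hypothetical caller-supplied function
    returning whatever figures of merit are measured on `test`.
    Returns the list of the 10 results, in the order
    (A_1 -> B_1), (B_1 -> A_1), ..., (A_5 -> B_5), (B_5 -> A_5).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(5):
        shuffled = list(samples)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        set_a, set_b = shuffled[:half], shuffled[half:]
        results.append(train_and_test(set_a, set_b))  # train on A_i, test on B_i
        results.append(train_and_test(set_b, set_a))  # train on B_i, test on A_i
    return results


def sensitivity_specificity(predicted, actual):
    """Labels: 1 = microcalcification cluster, 0 = normal tissue."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    tn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 0)
    positives = sum(actual)
    negatives = len(actual) - positives
    return tp / positives, tn / negatives
```

Averaging the ten sensitivity and specificity values gives summary figures of the kind quoted above, and their spread indicates how much the result depends on the particular split.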

Table 1: Evaluation of the performance of the standard
back-propagation learning algorithm for the neural classifier
according to the 5x2 cross validation method.
Train Set Test Set Sensitivity (%) Specificity (%)

Figure 3: FROC curve obtained on a test set of 140
mammograms (70 containing 89 microcalcification clusters and
70 normal views) extracted from the MAGIC-5 database.

Once each sub-image of a mammogram has been assigned
a degree of suspiciousness, the contiguous or
partially-overlapping suspicious sub-images have to be merged in
order to evaluate the system performance on the entire
mammographic images. A cluster detection criterion has
to be defined a priori. The effect that the choice of the
detection criteria, in addition to the size of the annotated region,
has on the CAD performance evaluation has been
systematically examined in the literature [18,19]. As there is no
universal scoring method currently in use for evaluating the
performance of a CAD system for microcalcification
cluster detection, we briefly describe the detection criteria we
adopted:

• a true cluster is considered detected if the region indi- 
cated by the system includes two or more microcalcifi- 
cations located within the associated truth circle; 

• all findings outside the truth circle are considered as 
false positive (FP) detections. 
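
A minimal Python sketch of this scoring rule is given below. The geometry is an assumption made for the example: CADe findings are taken to be axis-aligned boxes, the truth annotation a circle, and microcalcifications are represented by their annotated centres. The function names are hypothetical, and a finding that does not satisfy the first criterion is simply counted as a false positive.

```python
import math


def inside_circle(point, circle):
    """True if point (x, y) lies within circle (cx, cy, r)."""
    (x, y), (cx, cy, r) = point, circle
    return math.hypot(x - cx, y - cy) <= r


def inside_box(point, box):
    """True if point (x, y) lies within box (x0, y0, x1, y1)."""
    (x, y), (x0, y0, x1, y1) = point, box
    return x0 <= x <= x1 and y0 <= y <= y1


def score_finding(box, truth_circle, microcalcifications):
    """Return "TP" if the region indicated by the system includes at least
    two microcalcifications lying within the truth circle, else "FP"."""
    hits = sum(1 for p in microcalcifications
               if inside_circle(p, truth_circle) and inside_box(p, box))
    return "TP" if hits >= 2 else "FP"


# Toy annotation: a truth circle of radius 10 centred at (50, 50).
truth = (50.0, 50.0, 10.0)
calcs = [(48.0, 50.0), (52.0, 51.0), (90.0, 90.0)]
on_target = score_finding((45, 45, 55, 55), truth, calcs)
off_target = score_finding((85, 85, 95, 95), truth, calcs)
```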

The CADe performance was globally evaluated on a test
set of 140 images of the MAGIC-5 database (70 with
microcalcification clusters and 70 normal images) in terms of the
free-response operating characteristic (FROC) analysis [20]
(see fig. 3). The FROC curve is obtained by plotting the
sensitivity of the system versus the number of FP detections
per image (FP/im), while the decision threshold of the
classifier is varied. In particular, as shown in the figure, a
sensitivity value of 88% is obtained at a rate of 2.15 FP/im.
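
For illustration, the FROC points can be computed from a list of scored findings as in the following Python sketch. The data layout is an assumption: each finding is a `(score, cluster_id)` pair, where `cluster_id` identifies the true cluster it hits according to the criteria above, or is `None` for a false positive. The function name and toy numbers are hypothetical.

```python
def froc_points(findings, n_true_clusters, n_images):
    """Return (FP per image, sensitivity) pairs, one per decision threshold.

    findings: (score, cluster_id) pairs collected over the whole test set;
    cluster_id is None when the finding is a false positive.
    A true cluster counts once, however many findings hit it.
    """
    points = []
    for threshold in sorted({score for score, _ in findings}, reverse=True):
        kept = [(s, cid) for s, cid in findings if s >= threshold]
        detected = {cid for _, cid in kept if cid is not None}
        false_pos = sum(1 for _, cid in kept if cid is None)
        points.append((false_pos / n_images, len(detected) / n_true_clusters))
    return points


# Toy test set: 2 images, 2 true clusters ("a" and "b"), 5 findings.
curve = froc_points(
    [(0.9, "a"), (0.8, None), (0.7, "b"), (0.6, "a"), (0.3, None)],
    n_true_clusters=2,
    n_images=2,
)
```

Unlike the ROC curve, the horizontal axis is not bounded by 1, since the number of false positive marks per image has no fixed upper limit.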

Figure 4: Examples of the performances of the scaling procedure
for the CADe filter.

3.2 Testing the CADe on the MammoGrid database

The CADe system we developed and tested on the MAGIC- 
5 database has been adapted to the MammoGrid SMF im- 
ages by using the following procedure: 

• the wavelet-based filter has been tuned on the SMF 
mammograms; 

• the remaining steps of the analysis, i.e. the neural- 
based characterization and classification of the sub- 
images have been directly imported from the MAGIC-5 
CADe software. 

According to the MammoGrid project workflow [1], the
CADe algorithm has to run on mammograms previously
processed by the SMF software [16]. The SMF mammograms
are characterized by a different pixel pitch (100 μm
instead of 85 μm) and a different effective dynamical range
(16 bits per pixel instead of 12) with respect to the MAGIC-5
mammograms. A microcalcification digitized with an 85 μm
pixel pitch scanner appears bigger (in pixel units) than
the same object digitized with a 100 μm pixel pitch.
Therefore, the filter to be applied to the MammoGrid
mammograms is required to be sensitive to smaller objects. A
different choice of the range of scales considered in
the analysis proved adequate to accommodate
the difference in the pixel pitch. Once the matching
of the effective dynamical ranges of the two databases has
been performed, the wavelet decomposition is performed up
to level 3 instead of 4, since the details at level 4 are too big
to be correlated to microcalcifications. Only the details at
levels 2 and 3 (exceeding 2σ of the experimental distribution)
are retained for the synthesis. A test of this
scaling procedure has been performed on the mammograms
of 15 patients acquired both by the MAGIC-5 and by the
MammoGrid acquisition workstations. As shown in fig. 4,
the matching of the dynamical ranges and the scaling of
the wavelet-analysis parameters allow the CADe filter to
generate very similar processed images.
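
The arithmetic behind this rescaling can be sketched in Python. Two assumptions are made and should be read as such: the spatial scale of the details at decomposition level j is approximated by 2^j pixels (the usual dyadic rule of thumb, not a figure from the paper), and the dynamical ranges are matched by a plain linear mapping, since the paper does not state which mapping was used. The function names are hypothetical.

```python
def size_in_pixels(size_mm, pitch_um):
    """Extent in pixels of an object of physical size size_mm."""
    return size_mm * 1000.0 / pitch_um


def detail_scale_mm(level, pitch_um):
    """Approximate physical scale (mm) of dyadic wavelet details at `level`."""
    return (2 ** level) * pitch_um / 1000.0


def match_dynamic_range(value, from_bits=16, to_bits=12):
    """Linearly map a gray level from a from_bits range onto a to_bits range.

    Assumed mapping for illustration only.
    """
    return value * (2 ** to_bits - 1) / (2 ** from_bits - 1)


# The same 0.5 mm object seen at the two pixel pitches.
px_magic5 = size_in_pixels(0.5, 85)      # MAGIC-5 images, 85 um pitch
px_smf = size_in_pixels(0.5, 100)        # SMF images, 100 um pitch

# Physical scales probed by the retained detail levels in each case.
scales_magic5 = [detail_scale_mm(j, 85) for j in (2, 3, 4)]
scales_smf = [detail_scale_mm(j, 100) for j in (2, 3)]
```

Under these assumptions the level-4 details at a 100 μm pitch correspond to a scale of about 1.6 mm, larger than the scale reached by level 4 at an 85 μm pitch, which is consistent with dropping level 4 for the SMF images.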



The performance the rescaled CADe achieves on the
images of the MammoGrid database is the following: a
sensitivity of 82.2% is obtained at a rate of 4.15 FP/im. If the
analysis is performed independently on the digitized and
on the direct digital images, the results are: an 82.1%
sensitivity at a rate of 4.8 FP/im in the first case, and
an 83.3% sensitivity at a rate of 1.6 FP/im in the second
case. As can be noticed, the number of FP detections per
image in the case of digitized images is appreciably higher
than the corresponding rate for the direct
digital images. Although the SMF algorithm performs a sort
of normalization of images acquired in different conditions,
the digitized images are intrinsically noisier. A comparison
with the FROC obtained on the MAGIC-5 database
reported in fig. 3 points out that the overall CADe system
performance in the case of the MammoGrid database is
not as good as that obtained on the MAGIC-5 dataset.
One possible explanation for this decrease in sensitivity
is that the MammoGrid database already contains a large
number of cases that are not easily detectable. In this case an
improvement of the CADe performance would be achieved
once the database is enlarged.

4 Conclusion 

We developed a CADe system for microcalcification cluster 
identification suitable for different sets of data (digitized 
or direct digital, acquired with different acquisition param- 
eters, etc.). This CADe system has been developed and 
tested on the MAGIC-5 database and then adapted to the 
MammoGrid database of SMF mammograms by re-scaling 
some of the wavelet-filter parameters. This choice is moti- 
vated by two main reasons: the amount of fully-annotated 
SMF images containing microcalcification clusters available 
at present to the MammoGrid CADe developers is not large 
enough to perform a new training of the neural networks im- 
plemented in the characterization and classification proce- 
dures; moreover, the visual aspect of the filtered sub-images 
in the case both of MAGIC-5 images and SMF images is 
actually very similar. This makes us confident that the gen- 
eralization capability of the neural networks would account 
for the difference in resolution of the two original datasets. 
The scaling procedure we developed has two main
advantages: the wavelet filter is the only part of the analysis one
has to tune on the characteristics of a new dataset, whereas
the neural-based characterization and classification
procedures do not need to be modified; this scalable system can
be tested even on very small databases that do not allow the
learning procedure of the neural networks to be properly
carried out.

The preliminary results obtained on the MammoGrid
database are encouraging. Once the planned increase in the
population of the database is realized, a complete and more
robust test of the CADe performance on the pan-European
MammoGrid database will be carried out.

The CADe software is currently available on the
GRID-connected acquisition and annotation workstation
prototypes installed in the Hospitals of the MammoGrid
Consortium. The CADe can be remotely executed on the
distributed database and the clinical evaluation of the CADe
as second reader of screening mammograms has already
started.